--- title: Ebola Outbreak in DRC 2018 author: Michelle date: '2018-05-15' slug: drc-ebola categories: [] tags: - R - leaflet - Ebola description: '' ---
An outbreak of Ebola occurred in the Equator province of the Democratic Republic of Congo in April 2018. The WHO published its first Disease Outbreak News on the outbreak on May 10th and began efforts to conduct ring vaccination around the area. You can read more about it on the WHO’s website on the 2018 Ebola outbreak in the DRC.
Since the start of the outbreak, Caitlin Rivers, an Assistant Professor at the John Hopkins Center for Health Security has been digitizing the WHO Disease Outbreak Network situations reports and the DRC Ministry of Health mailing list reports and posting the data on her github repository.
I’ve put together some visualizations to explore the data here. This is more of an exploration of the data than a tutorial in how to do so, but all of the code is provided if you want to take a shot at it yourself.
library(tidyr)
library(sp)
library(leaflet)
library(rgdal)
library(RCurl)
library(ggplot2)
library(dplyr)
library(plotly)
All of the data is in a public github repository that can be accessed within R, making it easy to update by simply rendering the document. You need the RCurl package to read the csv file. To read in a github csv file you need click on the “Raw” button to get the url of the raw text file.
who.data <- read.csv(text = getURL( "https://raw.githubusercontent.com/cmrivers/ebola_drc/master/who/data.csv"), stringsAsFactors = F, header = T)
who.data$source <- "WHO"
drc.data <- read.csv(text = getURL("https://raw.githubusercontent.com/cmrivers/ebola_drc/master/drc/data.csv"), stringsAsFactors = F, header = T)
drc.data$source <- "MOH"
According to the WHO website, this includes the confirmed, probable, and suspected cases. Deaths are split with an additional category for health care workers. As of 2018-05-14, the data was being reported as cumulative, although the MOH reports also include a category for new cases, and deaths may not be cumulative.
head(who.data)
## event_date report_date province health_zone confirmed_cases
## 1 20180101 20180511 Equateur Bikoro NA
## 2 20180505 20180511 Equateur Bikoro NA
## 3 20180507 20180511 Equateur Bikoro 2
## 4 20180509 20180510 Equateur Bikoro 2
## 5 20180511 20180511 Equateur Bikoro 2
## 6 20180511 20180514 Equateur Wangata NA
## confirmed_deaths probable_cases probable_deaths suspect_cases
## 1 NA NA NA 21
## 2 NA NA NA 26
## 3 NA NA NA 24
## 4 NA 16 NA 12
## 5 NA 18 NA 14
## 6 NA 2 NA NA
## suspect_deaths cumulative_hcw contacts_traced source
## 1 17 NA NA WHO
## 2 NA NA NA WHO
## 3 NA NA NA WHO
## 4 18 3 NA WHO
## 5 18 3 75 WHO
## 6 NA NA 43 WHO
head(drc.data)
## event_date report_date health_zone confirmed_cases new_confirmed
## 1 20180501 20180510 Bikoro NA NA
## 2 20180510 20180510 Bikoro 2 NA
## 3 20180511 20180511 Bikoro 2 NA
## 4 20180511 20180511 Iboko 0 NA
## 5 20180512 20180512 Bikoro 2 NA
## 6 20180512 20180512 Iboko 0 NA
## probable_cases new_probable suspect_cases new_suspect deaths source
## 1 NA NA 21 NA 17 MOH
## 2 NA NA 9 NA 1 MOH
## 3 NA NA 6 NA 0 MOH
## 4 NA NA 6 NA 1 MOH
## 5 NA NA 6 NA 0 MOH
## 6 NA NA 6 NA 1 MOH
As of June 9 2018, the Ministry of Health Data seems to be more current (goes until 2018-06-06) so I’m just going to focus on that.
ebola.data <- drc.data %>%
mutate(Date = as.Date(as.character(event_date), format = "%Y%m%d")) %>%
#selet columns of interest
select(-starts_with("new")) %>%
gather(type, number, confirmed_cases:deaths) %>%
select(Date, health_zone, type, number)
The epidemic curve shows the number of cases per reporting period. Here, I plot the cumulative cases, which is how the CDC and MOH report them.
p <- ebola.data %>%
dplyr::select(-health_zone) %>%
dplyr::filter(type != "deaths") %>%
#shrink labels by wrapping
mutate(labels = gsub("_", "\n",type)) %>%
#order labels
mutate(labels = factor(labels, levels = c("suspect\ncases", "probable\ncases",
"confirmed\ncases"))) %>%
ggplot(., aes(x=Date, y = number)) +
geom_bar(stat="identity", aes(fill = labels), color = NA) +
theme_minimal() +
ylab("No. of cases")+
scale_fill_brewer(palette = "Set2", type = "qual", name = "Reporting Type")
ggplotly(p) %>%
layout(margin = list(r=100))
## Warning: Removed 7 rows containing missing values (position_stack).
As of June 9 2018, the growth in cases has slowed and the slope is almost zero, suggesting the epidemic has burned out.
The cases are reported by health zone. You can download a shapefile or GeoJSON of the DRC’s health zones here.
When this outbreak started, most digital data of the healthzones of the DRC was wrong (including my own!). You can read about how a group of Congolese cartographers worked to update and correct these maps in this Atlantic article by Ed Yong. The WHO and CDC have since updated its maps, but (as of June 9) I haven’t been able to find the correct shapefile freely available online.
health.zones <- readOGR("../../static/data/drc-ebola", "healthZones2", stringsAsFactors = F)
## OGR data source with driver: ESRI Shapefile
## Source: "/Users/mvevans/Dropbox/git/mevansblog/static/data/drc-ebola", layer: "healthZones2"
## with 515 features
## It has 2 fields
moh.zone <- ebola.data %>%
group_by(health_zone) %>%
#get lastest reporting date
filter(Date == max(Date)) %>%
#spread data to work with leaflet
spread(type, number) %>%
#calculate total cases
mutate(total_cases = sum(suspect_cases, probable_cases, confirmed_cases)) %>%
select(Name = health_zone, suspect_cases, probable_cases, confirmed_cases, total_cases, deaths)
moh.map <- merge(health.zones, moh.zone, by = "Name", all.x = T)
#fix NAs
moh.map@data[is.na(moh.map@data)] <- 0
#create color palette
pal <- colorNumeric("YlOrRd", domain = moh.map$total_cases)
labels.moh <- sprintf(
"<strong>%s</strong><br/>%g Total Cases, %g Deaths (DRC MOH)",
moh.map$Name, moh.map$total_cases, moh.map$deaths
) %>% lapply(htmltools::HTML)
leaflet(moh.map) %>%
# basemap
addProviderTiles(providers$OpenStreetMap.HOT) %>%
addPolygons(color = "gray08",
fillOpacity = 0.5,
fillColor = ~pal(total_cases),
label = labels.moh,
highlightOptions = highlightOptions(color = "#444444", weight = 2,
bringToFront = TRUE)) %>%
# add legend
addLegend(pal = pal, values = ~total_cases, opacity = 0.7, title = "Total\nCases",
position = "bottomright") %>%
# set custom zoom
setView(lng = 17.1, lat = -1.3, zoom = 7)